Balzan
Beyond Static Responses: Multi-Agent LLM Systems as a New Paradigm for Social Science Research
Haase, Jennifer, Pokutta, Sebastian
As large language models (LLMs) transition from static tools to fully agentic systems, their potential for transforming social science research has become increasingly evident. This paper introduces a structured framework for understanding the diverse applications of LLM-based agents, ranging from simple data processors to complex, multi-agent systems capable of simulating emergent social dynamics. By mapping this developmental continuum across six levels, the paper clarifies the technical and methodological boundaries between different agentic architectures, providing a comprehensive overview of current capabilities and future potential. It highlights how lower-tier systems streamline conventional tasks like text classification and data annotation, while higher-tier systems enable novel forms of inquiry, including the study of group dynamics, norm formation, and large-scale social processes. However, these advancements also introduce significant challenges, including issues of reproducibility, ethical oversight, and the risk of emergent biases. The paper critically examines these concerns, emphasizing the need for robust validation protocols, interdisciplinary collaboration, and standardized evaluation metrics. It argues that while LLM-based agents hold transformative potential for the social sciences, realizing this promise will require careful, context-sensitive deployment and ongoing methodological refinement. The paper concludes with a call for future research that balances technical innovation with ethical responsibility, encouraging the development of agentic systems that not only replicate but also extend the frontiers of social science, offering new insights into the complexities of human behavior.
- Europe > Germany > Berlin (0.40)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Government (1.00)
- Education (1.00)
- Leisure & Entertainment > Games > Computer Games (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
The land use-climate change-biodiversity nexus in European islands stakeholders
Moustakas, Aristides, Christoforidi, Irene, Zittis, George, Demirel, Nazli, Fois, Mauro, Zotos, Savvas, Gallou, Eirini, Stamatiadou, Valentini, Tzirkalli, Elli, Zoumides, Christos, Košić, Kristina, Christopoulou, Aikaterini, Dragin, Aleksandra, Łowicki, Damian, Gil, Artur, Almeida, Bruna, Chrysos, Panos, Balzan, Mario V., Mansoldo, Mark D. C., Ólafsdóttir, Rannveig, Ayhan, Cigdem Kaptan, Atay, Lutfi, Tase, Mirela, Stojanović, Vladimir, Ladičorbić, Maja Mijatov, Díaz, Juan Pedro, Expósito, Francisco Javier, Quiroga, Sonia, Cano, Miguel Ángel Casquet, Wang, Haoran, Suárez, Cristina, Manolaki, Paraskevi, Vogiatzakis, Ioannis N.
To promote climate adaptation and mitigation, it is crucial to understand stakeholder perspectives and knowledge gaps on land use and climate changes. Stakeholders across 21 European islands were consulted on climate and land use change issues affecting ecosystem services. Climate change perceptions included temperature, precipitation, humidity, extremes, and wind. Land use change perceptions included deforestation, coastal degradation, habitat protection, renewable energy facilities, wetlands, and others. Additional concerns such as invasive species, water or energy scarcity, infrastructure problems, and austerity were also considered. Climate and land use change impact perceptions were analysed with machine learning to quantify their influence. The predominant climatic characteristic is temperature, and the predominant land use characteristic is deforestation. Water-related problems are top priorities for stakeholders. Energy-related problems, including energy deficiency and issues with wind and solar facilities, rank high as combined climate and land use risks. Stakeholders generally perceive climate change impacts on ecosystem services as negative, with natural habitat destruction and biodiversity loss identified as top issues. Land use change impacts are also negative but more complex, with more explanatory variables. Stakeholders share common perceptions on biodiversity impacts despite geographic disparity, but they differentiate between climate and land use impacts. Water, energy, and renewable energy issues pose serious concerns, requiring management measures.
- North America > The Bahamas (0.14)
- Europe > Portugal > Lisbon > Lisbon (0.14)
- Europe > Portugal > Azores (0.04)
- (32 more...)
Technological folie à deux: Feedback Loops Between AI Chatbots and Mental Illness
Dohnány, Sebastian, Kurth-Nelson, Zeb, Spens, Eleanor, Luettgau, Lennart, Reid, Alastair, Gabriel, Iason, Summerfield, Christopher, Shanahan, Murray, Nour, Matthew M
Artificial intelligence chatbots have achieved unprecedented adoption, with millions now using these systems for emotional support and companionship in contexts of widespread social isolation and capacity-constrained mental health services. While some users report psychological benefits, concerning edge cases are emerging, including reports of suicide, violence, and delusional thinking linked to perceived emotional relationships with chatbots. To understand this new risk profile we need to consider the interaction between human cognitive and emotional biases, and chatbot behavioural tendencies such as agreeableness (sycophancy) and adaptability (in-context learning). We argue that individuals with mental health conditions face increased risks of chatbot-induced belief destabilization and dependence, owing to altered belief-updating, impaired reality-testing, and social isolation. Current AI safety measures are inadequate to address these interaction-based risks. To address this emerging public health concern, we need coordinated action across clinical practice, AI development, and regulatory frameworks.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Consumer Health (1.00)
Temperature and Persona Shape LLM Agent Consensus With Minimal Accuracy Gains in Qualitative Coding
Borchers, Conrad, Shahrokhian, Bahar, Balzan, Francesco, Tajik, Elham, Sankaranarayanan, Sreecharan, Simon, Sebastian
Large Language Models (LLMs) enable new possibilities for qualitative research at scale, including coding and data annotation. While multi-agent systems (MAS) can emulate human coding workflows, their benefits over single-agent coding remain poorly understood. We conducted an experimental study of how agent persona and temperature shape consensus-building and coding accuracy of dialog segments based on a codebook with 8 codes. Our open-source MAS mirrors deductive human coding through structured agent discussion and consensus arbitration. Using six open-source LLMs (with 3 to 32 billion parameters) and 18 experimental configurations, we analyze over 77,000 coding decisions against a gold-standard dataset of human-annotated transcripts from online math tutoring sessions. Temperature significantly impacted whether and when consensus was reached across all six LLMs. MAS with multiple personas (including neutral, assertive, or empathetic), significantly delayed consensus in four out of six LLMs compared to uniform personas. In three of those LLMs, higher temperatures significantly diminished the effects of multiple personas on consensus. However, neither temperature nor persona pairing lead to robust improvements in coding accuracy. Single agents matched or outperformed MAS consensus in most conditions. Only one model (OpenHermesV2:7B) and code category showed above-chance gains from MAS deliberation when temperature was 0.5 or lower and especially when the agents included at least one assertive persona. Qualitative analysis of MAS collaboration for these configurations suggests that MAS may nonetheless aid in narrowing ambiguous code applications that could improve codebooks and human-AI coding. We contribute new insight into the limits of LLM-based qualitative methods, challenging the notion that diverse MAS personas lead to better outcomes. We open-source our MAS and experimentation code.
- North America > United States > Arizona (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Curriculum > Subject-Specific Education (0.67)
- Education > Educational Setting > K-12 Education (0.46)
Co-Creative Learning via Metropolis-Hastings Interaction between Humans and AI
Okumura, Ryota, Taniguchi, Tadahiro, Taniguchi, Akira, Hagiwara, Yoshinobu
We propose co-creative learning as a novel paradigm where humans and AI, i.e., biological and artificial agents, mutually integrate their partial perceptual information and knowledge to construct shared external representations, a process we interpret as symbol emergence. Unlike traditional AI teaching based on unilateral knowledge transfer, this addresses the challenge of integrating information from inherently different modalities. We empirically test this framework using a human-AI interaction model based on the Metropolis-Hastings naming game (MHNG), a decentralized Bayesian inference mechanism. In an online experiment, 69 participants played a joint attention naming game (JA-NG) with one of three computer agent types (MH-based, always-accept, or always-reject) under partial observability. Results show that human-AI pairs with an MH-based agent significantly improved categorization accuracy through interaction and achieved stronger convergence toward a shared sign system. Furthermore, human acceptance behavior aligned closely with the MH-derived acceptance probability. These findings provide the first empirical evidence for co-creative learning emerging in human-AI dyads via MHNG-based interaction. This suggests a promising path toward symbiotic AI systems that learn with humans, rather than from them, by dynamically aligning perceptual experiences, opening a new venue for symbiotic AI alignment.
- Europe > Austria > Vienna (0.14)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics
Liu, Jiarui, Song, Yueqi, Xiao, Yunze, Zheng, Mingqian, Tjuatja, Lindia, Borg, Jana Schaich, Diab, Mona, Sap, Maarten
As large language models (LLMs) are increasingly used in morally sensitive domains, it is crucial to understand how persona traits affect their moral reasoning and persuasive behavior. We present the first large-scale study of multi-dimensional persona effects in AI-AI debates over real-world moral dilemmas. Using a 6-dimensional persona space (age, gender, country, class, ideology, and personality), we simulate structured debates between AI agents over 131 relationship-based cases. Our results show that personas affect initial moral stances and debate outcomes, with political ideology and personality traits exerting the strongest influence. Persuasive success varies across traits, with liberal and open personalities reaching higher consensus and win rates. While logit-based confidence grows during debates, emotional and credibility-based appeals diminish, indicating more tempered argumentation over time. These trends mirror findings from psychology and cultural studies, reinforcing the need for persona-aware evaluation frameworks for AI moral reasoning.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- South America > Brazil (0.05)
- Europe > France (0.05)
- (7 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
A Computational Model of Inclusive Pedagogy: From Understanding to Application
Balzan, Francesco, Santos, Pedro P., Gabbrielli, Maurizio, Albarracin, Mahault, Lopes, Manuel
Human education transcends mere knowledge transfer, it relies on co-adaptation dynamics -- the mutual adjustment of teaching and learning strategies between agents. Despite its centrality, computational models of co-adaptive teacher-student interactions (T-SI) remain underdeveloped. We argue that this gap impedes Educational Science in testing and scaling contextual insights across diverse settings, and limits the potential of Machine Learning systems, which struggle to emulate and adaptively support human learning processes. To address this, we present a computational T-SI model that integrates contextual insights on human education into a testable framework. We use the model to evaluate diverse T-SI strategies in a realistic synthetic classroom setting, simulating student groups with unequal access to sensory information. Results show that strategies incorporating co-adaptation principles (e.g., bidirectional agency) outperform unilateral approaches (i.e., where only the teacher or the student is active), improving the learning outcomes for all learning types. Beyond the testing and scaling of context-dependent educational insights, our model enables hypothesis generation in controlled yet adaptable environments. This work bridges non-computational theories of human education with scalable, inclusive AI in Education systems, providing a foundation for equitable technologies that dynamically adapt to learner needs.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)
- (6 more...)
- Education > Educational Setting (1.00)
- Education > Curriculum (0.68)
- Education > Educational Technology > Educational Software > Computer Based Training (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
- (2 more...)
Do Chains-of-Thoughts of Large Language Models Suffer from Hallucinations, Cognitive Biases, or Phobias in Bayesian Reasoning?
Learning to reason and carefully explain arguments is central to students' cognitive, mathematical, and computational thinking development. This is particularly challenging in problems under uncertainty and in Bayesian reasoning. With the new generation of large language models (LLMs) capable of reasoning using Chain-of-Thought (CoT), there is an excellent opportunity to learn with them as they explain their reasoning through a dialogue with their artificial internal voice. It is an engaging and excellent opportunity to learn Bayesian reasoning. Furthermore, given that different LLMs sometimes arrive at opposite solutions, CoT generates opportunities for deep learning by detailed comparisons of reasonings. However, unlike humans, we found that they do not autonomously explain using ecologically valid strategies like natural frequencies, whole objects, and embodied heuristics. This is unfortunate, as these strategies help humans avoid critical mistakes and have proven pedagogical value in Bayesian reasoning. In order to overcome these biases and aid understanding and learning, we included prompts that induce LLMs to use these strategies. We found that LLMs with CoT incorporate them but not consistently. They show persistent biases towards symbolic reasoning and avoidance or phobia of ecologically valid strategies.
- North America > United States > California (0.14)
- South America > Chile (0.04)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)
- Asia > Singapore (0.04)
- Health & Medicine > Therapeutic Area (0.64)
- Education > Educational Setting > K-12 Education > Primary School (0.46)
Modeling of New Energy Vehicles' Impact on Urban Ecology Focusing on Behavior
The surging demand for new energy vehicles is driven by the imperative to conserve energy, reduce emissions, and enhance the ecological ambiance. By conducting behavioral analysis and mining usage patterns of new energy vehicles, particular patterns can be identified. For instance, overloading the battery, operating with low battery power, and driving at excessive speeds can all detrimentally affect the battery's performance. To assess the impact of such driving behavior on the urban ecology, an environmental computational modeling method has been proposed to simulate the interaction between new energy vehicles and the environment. To extend the time series data of the vehicle's entire life cycle and the ecological environment within the model sequence data, the LSTM model with Bayesian optimizer is utilized for simulation. The analysis revealed the detrimental effects of poor driving behavior on the environment.
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > New York (0.04)
- North America > Canada (0.04)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)
- Transportation > Ground > Road (1.00)
- Transportation > Electric Vehicle (1.00)
- Energy > Energy Storage (1.00)
- Automobiles & Trucks (1.00)
Speculative Exploration on the Concept of Artificial Agents Conducting Autonomous Research
This paper engages in a speculative exploration of the concept of an artificial agent capable of conducting research. Initially, it examines how the act of research can be conceptually characterized, aiming to provide a starting point for discussions about what it means to create such agents. The focus then shifts to the core components of research: question formulation, hypothesis generation, and hypothesis verification. This discussion includes a consideration of the potential and challenges associated with enabling machines to autonomously perform these tasks. Subsequently, this paper briefly considers the overlapping themes and interconnections that underlie them. Finally, the paper presents preliminary thoughts on prototyping as an initial step towards uncovering the challenges involved in developing these research-capable agents.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Middle East > Malta > Northern Region > Western District > Balzan (0.04)